[00:00.000 --> 00:14.740]  All right. All right, all right, all right. What's up, y'all? All right, welcome to my
[00:14.740 --> 00:20.800]  talk, IoT Under the Microscope, here virtually. It's kind of disappointing. I'm
[00:20.800 --> 00:25.720]  looking at a screen here instead of all of you in the audience, but hopefully
[00:25.720 --> 00:30.620]  we'll get through this and we'll have a good time doing it. I want to talk about
[00:30.620 --> 00:36.240]  vulnerability trends in the supply chain. I've got some very interesting things, I
[00:36.240 --> 00:41.040]  think, that we found in a data set of our size. So hopefully you guys will learn
[00:41.040 --> 00:46.480]  something here and we'll have a little fun while doing it. All right. Okay. So who
[00:46.480 --> 00:52.300]  am I? I'm Parker Wiksell. I was born and raised here in Columbus, Ohio. I've got 25
[00:52.300 --> 00:58.040]  years of industry experience in cybersecurity and software development,
[00:58.040 --> 01:03.240]  full stack development, last 10 years of which have really been focused on
[01:03.240 --> 01:08.240]  cybersecurity research and product development. I was a contributor and
[01:08.240 --> 01:13.120]  developer of open source security projects like AFL Unicorn. It's a fuzzing
[01:13.120 --> 01:19.680]  framework for emulated binaries and then Patchwork, which is a static
[01:19.680 --> 01:23.860]  compilation of Linux kernels or patching of Linux kernels for debugging
[01:23.860 --> 01:29.680]  purposes. Like it was mentioned, I'm a senior engineer at Finite State. We're an
[01:29.680 --> 01:35.040]  IoT cybersecurity firm dealing with a lot of the topics that we're talking
[01:35.040 --> 01:38.400]  about here and all my data sets and stuff like that come from our Finite
[01:38.400 --> 01:42.960]  State repos and some of the products that we're working on here. I'm also a
[01:42.960 --> 01:47.980]  database lecturer over at The Ohio State University, looking to kick off yet
[01:47.980 --> 01:53.160]  another fall there, and then I'm a composer and a musician. Don't hold that
[01:53.160 --> 02:01.140]  against me. I realized early on that there's not a lot of money in music, so
[02:01.140 --> 02:06.160]  here I am on just another passion of mine, computers. All right, so why is this
[02:06.160 --> 02:09.560]  talk relevant to your interests? We're going to be talking about supply chain
[02:09.560 --> 02:14.600]  trends, vulnerable and not vulnerable, vulnerability standards and reporting,
[02:14.600 --> 02:19.500]  and then some firmware statistics and observations. Probably about the first
[02:19.500 --> 02:25.000]  half of the talk is going to be delving into the background of what supply chains
[02:25.000 --> 02:30.060]  are, some of the ways that we have to talk about vulnerabilities, what the
[02:30.060 --> 02:35.520]  supply chain introduces as far as vulnerabilities and visibility into those,
[02:35.520 --> 02:40.500]  and then probably the last half will be delving into the fun numbers, probably
[02:40.500 --> 02:44.640]  why you came to see this talk, but nevertheless, hopefully we learn
[02:44.640 --> 02:52.120]  something in both parts. So our data set that I'm pulling from right now, we do
[02:52.120 --> 02:57.960]  have partnerships with some private industry partners, so we do not include
[02:57.960 --> 03:02.540]  all of our private repos and stuff like that, but for this particular talk, I've
[03:02.540 --> 03:09.520]  got about 7 million files, representing about 50,000 firmware images, 10,000
[03:09.520 --> 03:16.380]  distinct product lines, and 150 different vendors. This is different
[03:16.380 --> 03:20.480]  architectures, different operating systems, you know, a lot of them are Linux
[03:20.480 --> 03:26.660]  based, some of them are RTOS, obviously, and we're hitting all the different
[03:26.660 --> 03:31.980]  verticals that you usually hear about in these talks, medical devices, critical
[03:31.980 --> 03:38.840]  infrastructure, security devices, home routers, Alexa, you know, whatever. We've
[03:38.840 --> 03:42.480]  got a bunch of different types of products in our data set, so hopefully
[03:42.480 --> 03:47.900]  the statistics that we talk about here on the second half, keep this in mind,
[03:47.900 --> 03:54.300]  it's fun to be able to trawl a data set of this size. So let's take a step back,
[03:54.300 --> 03:58.280]  let's talk about the supply chain, let's talk about the problems that are
[03:58.280 --> 04:02.600]  introduced as a part of the supply chain and what the problems are, maybe even go
[04:02.600 --> 04:09.560]  into some of the solutions. So if you are a manufacturer, XYZ, of a security
[04:09.560 --> 04:15.520]  camera, that security camera is running some sort of firmware on it. There's
[04:15.520 --> 04:18.260]  hardware and there's firmware. The firmware is actually software that's
[04:18.260 --> 04:21.460]  written for that hardware device. We like to think of it as firmware because it's
[04:21.460 --> 04:25.320]  kind of baked in, it's usually not as fluid or as dynamic as software tends to
[04:25.320 --> 04:32.580]  be in, say, a PC or whatever. But the camera still has a full processor and memory
[04:32.580 --> 04:35.760]  architecture, it's a full computer running in that thing. So if you can take
[04:35.760 --> 04:40.200]  advantage of that or take over that product from a vulnerability standpoint,
[04:40.200 --> 04:44.260]  you've accessed a whole computer's worth of resources. You have hardware
[04:44.260 --> 04:47.760]  components that go in the thing, you have drivers that talk to that hardware,
[04:47.760 --> 04:52.680]  operating systems, libraries, apps, you name it, they're all just the same,
[04:52.680 --> 04:57.500]  except they're called firmware. The problem is that on a security camera like
[04:57.500 --> 05:01.460]  this or any kind of IoT device, you're going to have multiple vendors. So if
[05:01.460 --> 05:06.260]  you're a company XYZ, you're making this product, those hardware components
[05:06.260 --> 05:11.120]  may come from various different vendors. And then there may be other vendors
[05:11.120 --> 05:16.960]  like vendor A who talks to your underlying camera optics and can put some
[05:16.960 --> 05:20.700]  image recognition on top of it or whatever else like that. Vendor B may be
[05:20.700 --> 05:26.400]  some support libraries or whatever. The real problem is that not only do
[05:26.400 --> 05:29.440]  you have to track vendor A, vendor B, vendor C, and what all they're putting
[05:29.440 --> 05:34.380]  onto your device, which quite frankly is not always the case. There's not always
[05:34.380 --> 05:40.340]  full disclosure there. But then vendor A and vendor B may rely on an open source
[05:40.340 --> 05:45.240]  library somewhere, vendor X, that you don't even know about. They may or may
[05:45.240 --> 05:48.900]  not even know that they're using it, depending on if their developers have
[05:48.900 --> 05:53.900]  reported on it. And the thing that makes that even worse is that vendor X
[05:53.900 --> 06:00.160]  library, say it's the same low-level image processing library, it may be
[06:00.160 --> 06:06.020]  version 1.1 in vendor A and vendor B in their libraries that they're including
[06:06.020 --> 06:12.380]  on your device has 1.5. Maybe vendor A's 1.1 version of vendor X has a
[06:12.380 --> 06:17.040]  vulnerability in it and the 1.5 version doesn't. How do you know what all
[06:17.040 --> 06:20.980]  component libraries you have in this simple camera? This is a full computer
[06:21.540 --> 06:29.880]  running on your network, right? So Donald Rumsfeld, when he was Secretary of
[06:29.880 --> 06:34.060]  Defense, brought this notion together that information can be divided into
[06:34.060 --> 06:38.960]  three categories, he said at the time. Known knowns, known unknowns, unknown
[06:38.960 --> 06:43.800]  unknowns, all right? And so we take that kind of approach towards IoT
[06:43.800 --> 06:50.060]  vulnerabilities. Our known knowns are vulnerabilities that have explicitly been
[06:50.060 --> 06:54.180]  discovered through scanning and testing on our devices. So we've tested it, we
[06:54.180 --> 06:57.780]  know that there's a vulnerability, we patch it or whatever like that. So that's
[06:57.780 --> 07:03.300]  our known knowns, all right? Our known unknowns are newly created software
[07:03.300 --> 07:08.360]  versions or even just upgraded versions or whatever that we've pulled in
[07:08.360 --> 07:12.620]  libraries or whatever that we don't have any kind of application testing behind
[07:12.620 --> 07:18.760]  yet, right? So who knows what is going on under the hood there? So we know the
[07:18.760 --> 07:23.080]  device, we just don't know if there's any vulnerabilities there. And then in
[07:23.080 --> 07:29.500]  the last categorization of the three, unknown unknowns are vulnerabilities
[07:29.500 --> 07:36.040]  that are in your camera or your device that you don't know and that nobody else
[07:36.040 --> 07:40.140]  knows. And these are what we're calling zero days, all right? Or really, they're
[07:40.140 --> 07:46.580]  not even zero days yet, because nobody has discovered them. Like, say,
[07:46.580 --> 07:52.980]  Ripple20 was just discovered a month or two ago. Before then, it was an unknown
[07:52.980 --> 07:58.600]  unknown. It was in all these different devices, but nobody knew it was there. But
[07:58.600 --> 08:04.000]  the weakness was still there waiting to be discovered, all right? So there's an
[08:04.000 --> 08:08.600]  awful lot of work to be done in discovering zero days. As a security
[08:08.600 --> 08:11.940]  researcher, we know that trying to protect yourself and get through all that
[08:11.940 --> 08:20.420]  is really a challenge trying to find out what is vulnerable or not. So for this
[08:20.420 --> 08:25.280]  talk, we're going to talk about unknown knowns. This is a fourth
[08:25.280 --> 08:29.600]  dimension that we like to talk about, and it comprises that which we
[08:29.600 --> 08:35.100]  intentionally refuse to acknowledge that we know or we don't like to know, okay? So
[08:35.100 --> 08:39.120]  these are vulnerabilities that are known to exist but have not been
[08:39.120 --> 08:43.680]  associated with all the systems that are actually affected. So we know all
[08:43.680 --> 08:48.440]  these CVEs are against this OpenSSL library, but we don't know if that SSL
[08:48.440 --> 08:52.700]  library is in our device, so we're just going to kind of ignore it for now, all
[08:52.700 --> 09:00.300]  right? So these unknown knowns are where we can do an awful lot on our part as
[09:00.300 --> 09:05.460]  manufacturers or security researchers to ferret out and to discover and to patch
[09:05.460 --> 09:12.520]  before other actors out there find those same vulnerabilities and test them
[09:12.520 --> 09:20.060]  against your same device, okay? So all of that can be done through a software
[09:20.060 --> 09:25.720]  bill of materials. A software bill of materials, or SBOM, borrows from the bill
[09:25.720 --> 09:28.980]  of materials that the manufacturing world is very
[09:28.980 --> 09:33.980]  used to: a list of all the components that make up a certain device. So if you're getting
[09:33.980 --> 09:38.500]  big printing presses or whatever else like that, you know all the pieces that make
[09:38.500 --> 09:42.260]  up that printing press so you can plan maintenance, you know what the cost is
[09:42.260 --> 09:48.540]  going to be up front. So as an industry, software industry, IoT industry, we
[09:48.540 --> 09:52.900]  should have the same thing, the software bill of materials, but we don't, all
[09:52.900 --> 09:57.400]  right? So manufacturers don't know all their components, all the different
[09:57.400 --> 10:01.420]  chips, system on chips that are running inside their devices, all that kind of
[10:01.420 --> 10:05.080]  stuff. Vendors don't know all their suppliers and their suppliers' suppliers
[10:05.080 --> 10:10.020]  because a lot of times vendors will make a product, but then they'll ship it off
[10:10.020 --> 10:14.880]  to a manufacturer to actually make for them. And so how do you validate that,
[10:14.880 --> 10:19.720]  you know, that things are exactly as you designed them or whatever? And then
[10:19.720 --> 10:24.880]  consumers, those of us who put these devices in our critical infrastructure,
[10:24.880 --> 10:29.480]  into our security systems, into our monitoring software, our monitoring
[10:29.480 --> 10:36.200]  networks, even in our homes, we put these devices into our networks, but we don't
[10:36.200 --> 10:41.160]  know all the software that's running on those devices. So the analogy is like, if
[10:41.160 --> 10:48.620]  you found a laptop out in the parking lot of your company or whatever, would
[10:48.620 --> 10:55.060]  you bring that laptop in, fire it up, boot it up, and plug it into your
[10:55.060 --> 11:00.100]  critical infrastructure security system network just to poke around on the
[11:00.100 --> 11:05.200]  laptop and see what's running on it? And I would hope that most of you or all of
[11:05.200 --> 11:09.900]  you would say no, there's not a chance that you would ever do that, because we
[11:09.900 --> 11:14.060]  know that those are full computer systems with operating systems that can
[11:14.060 --> 11:20.680]  be compromised with viruses, malicious software, etc. But yet with that exact
[11:20.680 --> 11:26.520]  same mentality, we don't apply to IoT devices. We'll take a camera that we do
[11:26.520 --> 11:30.380]  not know all the supply, all the software that's running on it, and all the
[11:30.380 --> 11:33.840]  weaknesses that might be inherent on those softwares, and we'll take that
[11:33.840 --> 11:38.740]  camera and we'll plug that same camera into and talk to it on our critical
[11:38.740 --> 11:44.680]  infrastructure network. All right? So we don't have any way to enumerate this.
[11:44.680 --> 11:49.520]  So how do we generate one of these software bills of materials reliably? How do
[11:49.520 --> 11:55.300]  we keep track of all those? Say it was even possible to keep track of all the
[11:55.300 --> 12:00.080]  components that are in there, how do we validate one of our devices against one?
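One way to picture validating a device against a bill of materials is a hash comparison. The sketch below assumes the firmware image has already been unpacked into a directory and that the manifest is a plain mapping from relative file path to expected SHA-256 digest. Both the manifest layout and the function names are assumptions made for illustration, not a standard SBOM format or any particular tool's method.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large firmware blobs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_manifest(root, manifest):
    """Compare an unpacked firmware tree against a manifest.

    manifest maps relative POSIX-style paths to expected SHA-256 hex digests.
    Returns the files that are missing, unexpected, or modified.
    """
    root = Path(root)
    actual = {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }
    return {
        "missing": sorted(set(manifest) - set(actual)),
        "unexpected": sorted(set(actual) - set(manifest)),
        "modified": sorted(
            name for name in manifest.keys() & actual.keys()
            if manifest[name] != actual[name]
        ),
    }
```

Anything reported as unexpected or modified is a file that was not in the build that was sent out, which is the "did they slip something else in" question. The check is only as trustworthy as the manifest itself, so the manifest has to come from a source the attacker cannot also rewrite.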
[12:00.080 --> 12:05.440]  So we as manufacturers may even develop a device, and we might know exactly what
[12:05.440 --> 12:09.280]  we want on it, but if we send it away and it comes back, who's to say that whoever
[12:09.280 --> 12:13.800]  made that device for us put our firmware on there exactly as it was intended, and
[12:13.800 --> 12:18.800]  they didn't slip something else in? The other thing would be, and this has
[12:18.800 --> 12:23.700]  happened, is if you as a consumer have a device and you want to update it to the
[12:23.700 --> 12:26.540]  latest update patch, say there's a security vulnerability and you want to
[12:26.540 --> 12:30.940]  patch it, you go to the manufacturer or the vendor's site and you download the
[12:30.940 --> 12:35.080]  update for that firmware, and you flash your device with that firmware, how do
[12:35.080 --> 12:39.680]  you know that software wasn't compromised? We have seen in the industry
[12:39.680 --> 12:45.420]  places where update servers have been compromised by malicious actors, and
[12:45.420 --> 12:49.740]  custom firmware has been placed in there as updates, and customers have
[12:49.740 --> 12:53.500]  downloaded malicious updates to their devices, which might have been perfectly
[12:53.500 --> 12:58.280]  fine in the first place, but now are running malicious software. So not only
[12:58.280 --> 13:02.560]  generating a software bill of materials, but being able to validate against a bill of
[13:02.560 --> 13:10.340]  materials, is critical. So let's shift away. One thing
[13:10.340 --> 13:14.300]  I'd like to mention about company commitment and stuff like that is, I just
[13:14.300 --> 13:18.500]  read Microsoft has a product, Azure Sphere, that they're developing that's a
[13:18.500 --> 13:24.460]  secure IoT chip, and it's one way of approaching it. Hey, let's control the
[13:24.460 --> 13:28.960]  ecosystem from the beginning and secure it down. And they've made commitments to
[13:28.960 --> 13:33.220]  the software bill of materials. So it would be nice if more companies could be
[13:33.220 --> 13:39.040]  able to control their environment, such as Microsoft has the luxury to do. And I
[13:39.040 --> 13:43.000]  think these kind of commitments are going to be our way forward, is following
[13:43.000 --> 13:47.360]  practices like this, generating our own software bill of materials, taking a
[13:47.360 --> 13:53.300]  hard look at what's running on that. All right. So let's switch to today. We have
[13:53.300 --> 13:56.780]  these devices, we don't know what's running on them. Let's look at the CVE,
[13:56.780 --> 14:02.020]  CPE reporting mechanisms that we have for finding vulnerabilities in our
[14:02.020 --> 14:09.560]  software. Okay. So CVE is Common Vulnerabilities and Exposures. It's a system
[14:09.560 --> 14:12.300]  that provides reference methods for publicly known information security
[14:12.300 --> 14:17.060]  vulnerabilities and exposures. Okay. So it's been around a while. MITRE is the
[14:17.060 --> 14:21.420]  one who came up with that and helps maintain that. So these are all the
[14:21.420 --> 14:25.640]  vulnerabilities that are discovered and reported for public knowledge. All right.
[14:25.640 --> 14:29.660]  So that's great. We have a central place to report vulnerabilities. As well, we
[14:29.660 --> 14:33.980]  have a National Vulnerability Database with the U.S. government that keeps track
[14:33.980 --> 14:38.780]  of CPEs, or Common Platform Enumeration. So this is a structured naming scheme
[14:38.780 --> 14:43.840]  for naming products. Okay. This is systems, software, software packages,
[14:43.840 --> 14:50.140]  et cetera. So there's a common way to put all this together. The only problem
[14:50.140 --> 14:56.520]  is, is that we have these frameworks for enumerating these kinds of
[14:56.520 --> 15:00.860]  vulnerabilities or these products, but there's not a lot of adherence or,
[15:00.860 --> 15:06.680]  sorry, hard regulations around how we use this. It's a very flexible system and
[15:06.680 --> 15:11.620]  it works fairly well when treated well. But there's a lot of inconsistency
[15:12.380 --> 15:20.620]  across the whole space about how we use CPEs, how we report them, how we link
[15:20.620 --> 15:26.480]  them to products that they apply to, et cetera. So let's look at some of these.
[15:26.480 --> 15:32.000]  All right. Are CPEs just complete products? Is it your whole camera system?
[15:32.000 --> 15:36.760]  Is that whole device one product, a CPE? Are the component systems in there,
[15:36.760 --> 15:41.160]  your optics and stuff like that? Are the system on chips that are running on there,
[15:41.160 --> 15:45.440]  is that a CPE? Should that have its own product entry into that database?
[15:45.840 --> 15:50.900]  Are there libraries within? You're using OpenSSL. That's great. But what about
[15:50.900 --> 15:55.980]  libcrypto that lives within it? Could that be separate from OpenSSL? Could that be
[15:55.980 --> 16:02.600]  its own product? So there's an awful lot of questions that we need to answer with
[16:02.600 --> 16:09.560]  all that. So if we look at something like OpenSSL, it's a commercial-grade secure
[16:09.560 --> 16:15.720]  sockets layer. It also has cryptographic libraries in it. If you go to the NVD,
[16:15.720 --> 16:19.820]  the database, and look up OpenSSL, because you want to find out what product it
[16:19.820 --> 16:26.760]  relates to, because maybe you found a vulnerability, OpenSSL brings back 405 results.
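For readers without the slide in front of them: a CPE 2.3 name is a colon-separated string whose fields are, in order, part, vendor, product, version, update, edition, language, software edition, target software, target hardware, and other, with `*` standing for "any". The sketch below splits such a name into labeled fields. The two example names follow that layout, but treat them as illustrations rather than verbatim NVD records.

```python
import re

# Field order of a CPE 2.3 formatted string, after the "cpe:2.3" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(name):
    """Split a CPE 2.3 formatted string into a dict of named fields.

    Colons escaped with a backslash belong to a field value, so the
    split only happens on unescaped colons.
    """
    pieces = re.split(r"(?<!\\):", name)
    if len(pieces) != 13 or pieces[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {name!r}")
    return dict(zip(CPE_FIELDS, pieces[2:]))


# Illustrative names only.
library = parse_cpe23("cpe:2.3:a:openssl:openssl:0.9.7:*:*:*:*:*:*:*")
node_module = parse_cpe23("cpe:2.3:a:openssl.js_project:openssl.js:*:*:*:*:*:node.js:*:*")

print(library["vendor"], library["product"], library["version"])
print(node_module["vendor"], node_module["product"], node_module["target_sw"])
```

Because the version is part of the name, each release of a product is its own entry, which is why a single keyword search can return hundreds of rows.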
[16:26.940 --> 16:32.740]  Now, every single version of that software is going to be another result. So 405 is not
[16:32.740 --> 16:38.540]  necessarily all the different types of OpenSSL, but there are several. So here's a few examples
[16:38.540 --> 16:45.200]  that come back with OpenSSL. So the very top one is usually what we think of as OpenSSL.
[16:45.200 --> 16:52.660]  It's a C++ library that is compiled into a lot of Linux systems and stuff with OpenSSL
[16:52.660 --> 16:59.000]  included. 0.9.7 was particularly vulnerable. There's a lot of different CVEs on there,
[16:59.000 --> 17:05.060]  0.9.8, 1.0.1, et cetera. So there's a lot of different versions of that. Those star
[17:05.060 --> 17:09.440]  fields, we'll just kind of gloss over for now, but those are further ways that you can
[17:09.440 --> 17:15.160]  enumerate or specify what specific product this is if there's a lot of different versions.
[17:15.700 --> 17:21.240]  Betas, alphas, different platforms, et cetera. But what are all these other CPEs that
[17:21.240 --> 17:28.180]  we see out there? We see this Calderone PyOpenSSL. So the first part is our vendor. The second
[17:28.180 --> 17:34.280]  part is the actual product. So we have this PyOpenSSL. I guess we can make a guess that
[17:34.280 --> 17:41.120]  that would be a Python binding for OpenSSL or a Python library. Then we have Lua OpenSSL
[17:41.120 --> 17:46.980]  on the next line. So we guess maybe that's Lua script. But then look at the next two.
[17:46.980 --> 17:53.520]  We have Node OpenSSL. And if you go all the way down to the Node.js, that's the target
[17:53.520 --> 17:59.540]  software that it's targeting, not the language, but the target software, which is it's written
[17:59.540 --> 18:05.840]  for Node.js, right? So it's a JavaScript module written for Node. But the very next line,
[18:05.840 --> 18:13.960]  OpenSSL.js project OpenSSL.js. So you have Node OpenSSL and then you have OpenSSL.js
[18:13.960 --> 18:20.980]  and you have Node.js at the end. So now you have someone who's writing for Node that has
[18:20.980 --> 18:24.080]  their own way of specifying. But you have two different competing libraries. Which one
[18:24.080 --> 18:29.440]  is which? And then you go down to the last one, OpenSSL project OpenSSL. Well, now that
[18:29.440 --> 18:36.400]  looks an awful lot like the first one, OpenSSL. So which one is that? Well, the hard part
[18:36.400 --> 18:41.840]  is we really don't know. And if you dig into the metadata about this and you actually go
[18:41.840 --> 18:50.080]  to the web page, the title doesn't help you very much. OpenSSL project OpenSSL. So we
[18:50.080 --> 18:55.320]  go into the references and in the change log, we see a GitHub reference there. And in there
[18:55.320 --> 18:59.420]  we have a Rust OpenSSL. And if you go in there and you look at these different things, you
[18:59.420 --> 19:06.640]  have these keywords of Rust. Ah, that's for the Rust language. Okay? So finding qualified
[19:06.640 --> 19:13.620]  CPEs can be a real challenge. The other thing that's a real challenge
[19:13.620 --> 19:22.220]  here is we don't know between CPEs what relates to each other. Does this Rust language depend
[19:22.220 --> 19:29.160]  on OpenSSL of a certain version, the C, C++ version of the first line, to be a certain
[19:29.160 --> 19:39.020]  version, right? Does 0.9.2 map to 0.9.7 or 0.9.7a or 1.0.1? We don't have these kinds
[19:39.020 --> 19:46.680]  of interrelations. So not only is generating an SBOM difficult, but developing a body of
[19:46.680 --> 19:53.700]  ground truth around an SBOM is extremely difficult. And it's not a problem that you think of because
[19:53.700 --> 19:59.800]  you think it should be obvious, but who owns that? Is it the company's responsibility to
[20:00.380 --> 20:06.320]  add their platform to the CPE database so that if vulnerabilities are found against it, that
[20:06.320 --> 20:12.620]  there's reliable data there? Or is it the CVE reporters who find the vulnerabilities? Is it their job
[20:12.620 --> 20:19.700]  to correctly find and identify the platforms if they're not there? So that's just a little
[20:19.700 --> 20:25.740]  background here on the CPE system. They're good systems to start off with, but we need more on
[20:25.740 --> 20:30.600]  top of that. We need some ground truth. We need some ways to relate this. By the way, there are
[20:30.600 --> 20:36.480]  some projects that are starting to put these together, but it's hard. Do you scrape the apt-get
[20:36.480 --> 20:47.380]  repos for what components are in OpenSSL or HTTP servers? Certain HTTP servers rely on OpenSSL.
[20:47.380 --> 20:52.120]  Which versions go with which? Well, if you're installing it, you know because you look at the
[20:52.120 --> 20:58.080]  readme for the HTTP installer, right? And it'll tell you which versions you have, but we don't have
[20:58.080 --> 21:05.940]  any way of systematically obtaining any of that information. Okay, so let's go to a specific
[21:05.940 --> 21:11.600]  example of a vulnerability and let's look at the supply chain here. We'll go with an old example.
[21:11.600 --> 21:17.700]  This is from four years ago. I'm not going to try to butcher Robert's last name,
[21:17.700 --> 21:23.000]  but Darkonius, he released a write-up on a router backdoor that he had discovered
[21:23.880 --> 21:30.800]  originating from a version of OpenWrt, an open source router operating system, that at that point
[21:30.800 --> 21:37.600]  was 10 years old. So there was this backdoor hash that's down in the
[21:37.600 --> 21:46.300]  code below, 1D, 680, whatever. If you entered in that on the command line, you immediately got a
[21:46.300 --> 21:52.520]  root shell and you can see the relevant code there. The problem is, is this was just in an
[21:53.240 --> 22:01.560]  OpenWrt version that was 10 years old from, I guess 2006, right? But the write-up that he had
[22:01.560 --> 22:07.480]  was that this was not on an OpenWrt router; this was on a commercial router that had included
[22:08.700 --> 22:13.560]  OpenWrt and its libraries and was using components such as this script called
[22:13.560 --> 22:24.220]  login.sh as a part of its operating system. All right? So Darkonius found the backdoor in
[22:24.220 --> 22:29.660]  one or two sets of devices. And if you read through the comments, different people
[22:29.660 --> 22:34.120]  were chiming in, oh, I just tested it against this and I found it on this and this. So we found,
[22:34.120 --> 22:42.440]  you know, in the comments, three, four, maybe five models of devices listed. But how is Darkonius
[22:42.440 --> 22:47.480]  supposed to know and go out and find, is it his responsibility to go find every single device that
[22:47.480 --> 22:56.720]  this relates to? Is it even his obligation to go to the manufacturer of this device and tell them,
[22:56.720 --> 23:07.700]  I found this in there. If you go and you look for this CVE, you don't find this CVE in the database.
[23:07.760 --> 23:14.600]  If you go even further to the CPE, the platform for this device that he looked for, you don't
[23:14.600 --> 23:20.440]  even find CPEs that are related to this specific device. And this is not a small device. This
[23:20.440 --> 23:25.200]  is a device that made its rounds and you can find it in different
[23:25.200 --> 23:33.760]  places. So whose responsibility is it? In our dataset, when we looked at all of our firmware
[23:33.760 --> 23:41.240]  just sitting around, we just did a string search for that magic hash. And lo and behold, we found
[23:41.240 --> 23:50.260]  3,810 files that had this hash in it. And it wasn't just the hash, it was the full login script.
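A string search like this takes very little code. The sketch below walks a directory of already unpacked firmware files and yields every file whose raw bytes contain a given marker, carrying a small overlap between reads so a match split across two chunks is not missed. The function name and parameters are illustrative, and the marker is left as an argument because the full hash is not spelled out in the talk.

```python
from pathlib import Path


def files_containing(root, needle, chunk_size=1 << 20):
    """Yield every file under root whose bytes contain needle.

    Files are read in chunks. The last len(needle) - 1 bytes of each
    window are carried into the next one so that a match straddling a
    chunk boundary is still found.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        tail = b""
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                window = tail + chunk
                if needle in window:
                    yield path
                    break
                tail = window[-overlap:] if overlap else b""


# Example use, with a placeholder marker and a hypothetical directory:
# for hit in files_containing("unpacked_firmware/", b"<backdoor hash here>"):
#     print(hit)
```

Searching raw bytes is what lets the script turn up inside compiled binaries as well as in standalone shell files. It only works on content that is stored uncompressed and unencrypted, so firmware images usually need to be unpacked first.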
[23:50.260 --> 23:57.820]  But notice the file name, CLI2, factoryboot.sh, login.sh. So the original login.sh was a part of
[23:57.820 --> 24:04.300]  the original package, but we found the actual login.sh encapsulated in a binary called a
[24:04.300 --> 24:11.320]  command line interface, a CLI, or even CLI2, that was used by web servers and stuff on IoT devices
[24:11.320 --> 24:18.900]  that still ran the same code from 2006 that nobody, I'm sure nobody knows is in there. The
[24:18.900 --> 24:25.200]  manufacturer doesn't even know it's there anymore. And the thing is, is that three major vendor
[24:25.200 --> 24:31.820]  companies have this. 44 different product models and 281 versions of this firmware. So this
[24:32.600 --> 24:38.620]  OpenWrt login.sh, this hash, has made its way into many, many, many different vendors. This isn't
[24:38.620 --> 24:43.920]  just one person accidentally downloading it and putting it in. This is the supply chain of people
[24:44.460 --> 24:50.500]  taking from A, who takes from X, who takes from Y, and it multiplies. So this is the heart of the
[24:50.500 --> 24:58.000]  problem. This is why we're in the bind that we're in, is the supply chain. We'll do one more example
[24:58.000 --> 25:04.760]  before we get into the more juicy stuff, the statistics that we found. This latest set of CVEs
[25:04.760 --> 25:10.580]  that we're talking about is Ripple20. I'm sure you guys have heard of it. 19 zero-day vulnerabilities
[25:10.580 --> 25:16.700]  amplified by the supply chain. Its title says it all. And this was reported
[25:16.700 --> 25:23.060]  two months ago or so. And according to their white paper on it, this affects hundreds of millions of
[25:23.060 --> 25:29.180]  devices or more, includes multiple remote code execution vulnerabilities. So that's the worst
[25:29.180 --> 25:34.640]  kind of vulnerability that you'd want to have. Many other major international vendors suspected
[25:34.640 --> 25:41.720]  of being vulnerable in medical, transportation, industrial control systems, etc. So my previous
[25:41.720 --> 25:49.540]  example kind of laid the groundwork for how something like Ripple20 exists. All right.
[25:49.540 --> 25:55.960]  So let's just take one of the CVEs as an example that they reported. All right. So this was published
[25:56.600 --> 26:04.440]  two months ago, June 17th, 2020. The description of it: the Treck TCP/IP stack
[26:04.440 --> 26:12.140]  before this version allows remote code execution related to IPv4 tunneling. Okay. So this is a
[26:12.140 --> 26:16.340]  vulnerability. They said it's bad. There's a base score of 10, critical, on here because of the remote
[26:16.340 --> 26:25.740]  code execution and how easy it is to theoretically get access to this TCP/IP stack. If we look at
[26:25.740 --> 26:33.820]  the CPEs related to it, since all CVEs list the CPEs that are related to them, we find these four
[26:33.820 --> 26:43.280]  Treck versions are all rolled up underneath this Treck stack CPE. The problem is that when we look
[26:43.280 --> 26:50.620]  for this Treck stack and we click on the actual record in the CPE database, we find
[26:50.700 --> 26:57.880]  in the quick info that the CPE for this Treck stack was created almost a month after the vulnerability
[26:57.880 --> 27:06.560]  was discovered. All right? So there was no product for Treck TCP. We had to invent, not we, but the
[27:06.560 --> 27:12.380]  vulnerability researchers had to invent the CPE that even belonged to the Treck stack, right?
[27:13.240 --> 27:20.560]  And then we go back to our original problem. If you have this Treck library and you're trying
[27:20.560 --> 27:27.080]  to figure out where all it's at in the world, you've got to kind of do some
[27:27.080 --> 27:31.020]  detective work. And they did some detective work because we don't have any CPE to CPE
[27:31.600 --> 27:36.180]  relationships. And even if we did, this CPE didn't even exist. Nobody even knew that Treck TCP
[27:36.720 --> 27:44.140]  existed. So reading through the paper about this, I almost imagine
[27:45.420 --> 27:53.520]  Luis from Ant-Man 2, when he's being interviewed, where is Scott, you know? And I kind of imagine
[27:53.520 --> 27:58.760]  him talking about, you know, hey, Luis, where is Treck? And he goes, hey, man, see, that's
[27:58.760 --> 28:02.920]  complicated. See, Treck, they're some smart guys, but they need some help, right? So they go to
[28:02.920 --> 28:07.120]  Elmic Systems, and they're like, hey, we need some developers. And Elmic Systems is like, yeah,
[28:07.120 --> 28:10.860]  we think you've got it going on, so let's get together. And so they started working together
[28:10.860 --> 28:15.100]  but then something happened. They're like, nah, homie, we're split. We're out of here. So Treck
[28:15.100 --> 28:19.120]  goes off on their own way, and Elmic Systems go off on their own way. And Treck be like,
[28:19.120 --> 28:24.080]  we got the Treck TCP/IP stack. We're going to market all this in the United States. And Elmic
[28:24.080 --> 28:28.900]  Systems are like, no, homie, we're no longer Elmic Systems. We're Zuken Elmic. And we got the
[28:28.900 --> 28:33.740]  KASAGO TCP/IP. I know it sounds like yours, but it's better because it's been renamed. And you
[28:33.740 --> 28:42.580]  think you're hot? Dang, we got all of Asia that we're going through. Meanwhile, you got
[28:42.580 --> 28:48.100]  security researchers in the middle of the night going, hey, can you just tell me where Treck is?
[28:48.100 --> 28:53.700]  And Treck is like, hmm, see, it's complicated. And they go to Zuken Elmic. And see, they're
[28:53.700 --> 29:01.020]  just like, where's this KASAGO at? And we can't even figure it out because the legal system is
[29:01.020 --> 29:05.860]  cutting us off. Whereas Treck is back here going, let's see, I got an uncle's girlfriend, boyfriend
[29:05.860 --> 29:13.460]  who had a relative named HP. And it's kind of like the Rona virus, right? Like, where's the
[29:13.460 --> 29:20.680]  contact tracing on this? And their uncle's second cousin, once removed, is named Aruba.
[29:20.680 --> 29:24.960]  And I think we were at a party together. And meanwhile, Zuken Elmic is like,
[29:24.960 --> 29:30.060]  and JSOF is just like, where is this? And Zuken Elmic and Treck and all these people can't
[29:30.060 --> 29:33.880]  find out. And they're like, that's what I'm trying to say. It's complicated.
[29:36.600 --> 29:42.780]  Right? I'm sorry. I probably butchered that. But I can imagine industrial control systems
[29:42.780 --> 29:54.960]  and industrial plants that have CIP-013 investigators or people coming in. I can see their first slide
[29:54.960 --> 30:01.020]  to the CIP-013 auditors is going to be this picture of Luis and two words, it's complicated.
[30:01.300 --> 30:07.700]  Right? Because it is. Trying to find out where this Treck TCP stack was, according to this paper,
[30:07.700 --> 30:15.180]  pretty nuts. Okay? So, anyways, it was developed somewhere in the 90s to 2000. It was included in
[30:15.380 --> 30:19.800]  a large quantity of firmware for devices. There's been various patches to it. Some of the Ripple20
[30:19.800 --> 30:29.140]  CVEs were patched in versions as far back as 2009. So, you're a vendor. What products of yours
[30:29.140 --> 30:34.160]  contain this TCP/IP stack? How do you know? How do you update a device when a patch is present or
[30:34.160 --> 30:40.420]  even know what version of the Treck stack you used to have in your device? How serious is the threat
[30:40.420 --> 30:47.120]  to your device? Right? Maybe it's not that problematic. Maybe it was just included but
[30:47.120 --> 30:52.820]  never used. So, reproducing CVEs is problematic. The actual attack vectors are really hard to
[30:52.820 --> 30:58.760]  classify in these individual products. You potentially waste vendors' time on trivial
[30:58.760 --> 31:05.140]  vulnerabilities that really aren't practical. So, really, the bottom line for vendors is,
[31:05.140 --> 31:09.180]  how much money do you spend trying to find out whether you have to develop a firmware patch for
[31:09.180 --> 31:13.460]  software that you don't even know is in your device? Or how likely is it that the software is
[31:13.460 --> 31:20.120]  even going to be exploitable? Right? So, it's a hard, hard question. So, I'll come back to the
[31:20.120 --> 31:25.600]  software bill of materials here in a little bit. But just for right now, let's make a little shift
[31:25.600 --> 31:32.500]  over into our database. And let's look at some of the vulnerability statistics that we have here.
[31:32.940 --> 31:37.600]  Hopefully, you guys find this enlightening. I certainly did. This just kind of represents
[31:37.780 --> 31:46.120]  a slice of the industry. We have medical devices. We have industrial control systems
[31:46.120 --> 31:53.060]  gear. We have all sorts of different sectors and verticals represented in our dataset.
[31:53.060 --> 32:00.500]  So, here we go. All right. So, /etc/resolv.conf popular name servers. 127.0.0.1, not surprisingly,
[32:00.500 --> 32:08.780]  is one of the top ones. We got some Google addresses there, 8.8.4.4. And then the next three
[32:08.780 --> 32:16.420]  down there that you see 168 are all Asian DNSs. And then you've got some strange noise in there.
[32:16.480 --> 32:25.320]  192.168. Someone.com. A couple others. So, you know, if you have all these devices and they're
[32:25.320 --> 32:31.760]  going to the U.S. and they have Asian domain servers, you're going to have a performance hit
[32:31.760 --> 32:35.840]  or vice versa. How do you know where these products are going and which name servers they're
[32:35.840 --> 32:45.340]  hitting? I saw some forum posts about 192.168.0.7. Maybe the people who made /etc/resolv.conf had it on
[32:45.340 --> 32:50.520]  test and they were just copying and pasting from a forum into this file. Someone.com,
[32:50.520 --> 32:56.540]  that actually was one vendor, but it's on a whole slew of firmware for all sorts of
[32:56.540 --> 33:01.220]  different products in their product line. So you can see not only is that illustrative that
[33:01.220 --> 33:05.220]  someone has just put that in as an example and they found that example somewhere,
[33:05.220 --> 33:09.940]  but it's in all their /etc/resolv.conf files and they just copy and paste that file into all their other
[33:09.940 --> 33:15.260]  firmware, right? So where's the checks and balances on what's actually running in there?
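The kind of name server tally described here could be sketched roughly as follows. This is a minimal illustration, not the tooling used for the talk: the function names (`parse_nameservers`, `classify`, `tally`) and the sample file bodies are hypothetical, and it assumes each firmware's resolv.conf has already been extracted as text.

```python
import ipaddress
from collections import Counter


def parse_nameservers(resolv_conf_text):
    """Return the value of every `nameserver` line in a resolv.conf body."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (resolv.conf uses '#' or ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def classify(server):
    """Bucket an entry as loopback, private, public, or not an IP at all."""
    try:
        addr = ipaddress.ip_address(server)
    except ValueError:
        return "not-an-ip"  # e.g. a host name pasted in from an example
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"


def tally(resolv_conf_texts):
    """Count how many files mention each name server at least once."""
    counts = Counter()
    for text in resolv_conf_texts:
        for server in set(parse_nameservers(text)):
            counts[server] += 1
    return counts


# Hypothetical sample files, one string per firmware image.
samples = [
    "nameserver 127.0.0.1\n",
    "nameserver 8.8.4.4\nnameserver 192.168.0.7\n",
    "# copied from a forum post\nnameserver someone.com\nnameserver 8.8.4.4\n",
]
for server, count in tally(samples).most_common():
    print(server, classify(server), count)
```

Entries that land in the "private" or "not-an-ip" buckets are the ones worth a second look, since they usually point to a copied example rather than a deliberate choice.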
[33:15.880 --> 33:24.280]  Even stuff like corp.ubnt.com. Corp.ubnt.com doesn't exist, but it's running on a device as
[33:24.420 --> 33:30.260]  a resolver. So if someone decides to turn up that host name and then start serving requests on
[33:30.260 --> 33:36.020]  there, you could have some interesting results, right? Here's another one. TTY count in
[33:36.020 --> 33:42.680]  /etc/securetty. So if you look at the bottom, the /etc/securetty file allows you to specify which TTY
[33:42.680 --> 33:50.280]  devices a root user is allowed to log in on. So let's just make one thing clear. Once you have an
[33:50.280 --> 33:57.240]  IoT device in production, you probably shouldn't be logging into that device at all. All right?
[33:57.240 --> 34:04.240]  So most normal operation shouldn't allow anybody from the outside to ever be tunneling into that
[34:04.240 --> 34:14.300]  device remotely. Certainly not a root user, okay? So I want to highlight this 196, 148. So what that
[34:14.300 --> 34:21.880]  means is that these counts are the number of shells, or sorry, the number of TTYs listed as
[34:21.880 --> 34:32.900]  approved TTYs that root can log into. So 1,486 firmware have an /etc/securetty file in them
[34:32.900 --> 34:42.440]  with 196 approved TTY locations for root to log in on. All right, so I tracked this one down because
[34:42.440 --> 34:48.200]  this file is just like, where is this anomaly from? I found a Yocto project, which is a way to
[34:48.200 --> 34:53.980]  programmatically or to easily generate a Linux operating system custom build for your project.
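Counting approved TTYs per securetty file, and then grouping firmware by that count, might look something like this. It is only a sketch under assumptions: the helper names and the sample data are made up, and it treats every non-blank, non-comment line as one approved TTY.

```python
from collections import Counter


def securetty_entries(securetty_text):
    """Return the TTY names listed in a securetty body, one per line."""
    entries = []
    for raw in securetty_text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def tty_count_histogram(securetty_by_firmware):
    """Map 'number of approved TTYs' to 'number of firmware with that count'."""
    return Counter(
        len(securetty_entries(text)) for text in securetty_by_firmware.values()
    )


# Hypothetical input: firmware name -> contents of its securetty file.
sample = {
    "fw-a": "console\ntty1\n",
    "fw-b": "# development build\nconsole\nttyS0\n",
    "fw-c": "console\n",
}
print(tty_count_histogram(sample))
```

A single outlier bucket with a very large TTY count and a large firmware count is the signature of one permissive file being reused across many builds.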
[34:54.240 --> 35:00.340]  And this particular file that I found was from 2013. So here is a supply chain, someone who's
[35:00.340 --> 35:06.100]  building and wants a securetty, they pull down this file that's pretty permissive, probably for
[35:06.100 --> 35:10.940]  development purposes, and not really supposed to be for production. And yet here it makes its way
[35:10.940 --> 35:17.700]  into all these devices. So here's the supply chain in action. All right, number of valid login shells.
[35:17.700 --> 35:25.940]  Okay, so this is in /etc/shells. This is the number of acceptable different types of
[35:25.940 --> 35:32.120]  shells that you can log in from, and most have one. So again, whether you should have a shell that
[35:32.120 --> 35:37.660]  people can log into or not, that's questionable, but at least a lot of them, majority of them, have
[35:37.660 --> 35:44.860]  only one. But then we have this 10 login shells. And I guess what they want by this, a little tongue
[35:44.860 --> 35:50.040]  in cheek, is, you know, if their hackers are coming in and are used to having all their key bindings in
[35:50.040 --> 35:54.480]  their environment files for bash, they want to make it easier for them. So they're just going to throw
[35:54.480 --> 36:00.740]  in bash in there. And if you're ash or zsh, we got you covered. You can log in however you want.
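A check for firmware that list more than one valid login shell could be sketched like this. The names and sample contents are hypothetical, and the sketch assumes that any line beginning with `/` in the shells file is one valid shell path.

```python
def valid_login_shells(shells_text):
    """Return the distinct shell paths listed in an /etc/shells body."""
    shells = []
    for raw in shells_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("/") and line not in shells:
            shells.append(line)
    return shells


def firmware_with_many_shells(shells_by_firmware, threshold=1):
    """Return names of firmware listing more than `threshold` shells."""
    return sorted(
        name
        for name, text in shells_by_firmware.items()
        if len(valid_login_shells(text)) > threshold
    )


# Hypothetical input: firmware name -> contents of its /etc/shells file.
sample = {
    "fw-a": "/bin/sh\n",
    "fw-b": "# shells\n/bin/sh\n/bin/bash\n/bin/ash\n/bin/zsh\n",
}
print(firmware_with_many_shells(sample))
```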
[36:00.960 --> 36:07.620]  Again, a little tongue in cheek, but that's what that number represents. An additional thing here
[36:07.620 --> 36:15.060]  that was kind of concerning is when I delved into the actual types of devices that this was,
[36:15.060 --> 36:23.500]  over 60 of these were models of patient monitoring health systems. Okay, so over 60
[36:23.500 --> 36:27.640]  different models, which represents however many thousands or hundreds of thousands of
[36:28.200 --> 36:36.060]  health systems, patient monitors that are sitting bedside, have 10 or so login shells that somebody
[36:36.060 --> 36:40.680]  could log in with, right? These are just configuration examples in the supply chain. I'm
[36:40.680 --> 36:46.580]  not saying these are actual vulnerabilities, they're just interesting anomalies from the supply chain.
[36:46.580 --> 36:51.200]  Okay, so here's a questionable config. We're about out of time. So we're just going to go
[36:51.200 --> 37:00.480]  through these quick. Firmware where the root user has a login shell, 15,345. All right, so 30%
[37:00.480 --> 37:05.540]  of all of our firmware that we looked at has some sort of login shell. Whereas if you don't want the
[37:05.540 --> 37:12.480]  root to log in, use /sbin/nologin. Guess how many? Over, under? One. We found one firmware that actually
[37:12.480 --> 37:17.120]  had /sbin/nologin in there. Okay, firmware with keys in authorized_keys. Now this is a bad one
[37:17.120 --> 37:23.180]  because this basically says if someone SSHs into your box, if your key matches their key
[37:23.180 --> 37:28.440]  in the authorized keys, you get to log in without any password. So this is a classic backdoor.
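The two checks just described, whether root has a real login shell and whether an authorized_keys file ships with keys in it, might be approximated as below. This is a sketch under assumptions: the function names are invented, and treating `/usr/sbin/nologin` as equivalent to `/sbin/nologin` is an addition here, not something from the talk.

```python
# Assumption: both common nologin paths count as "root cannot log in".
NOLOGIN_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin"}


def root_shell(passwd_text):
    """Return the shell field of the root entry in a passwd body, or None."""
    for raw in passwd_text.splitlines():
        fields = raw.strip().split(":")
        if len(fields) >= 7 and fields[0] == "root":
            return fields[6]
    return None


def root_has_login_shell(passwd_text):
    """True if a root entry exists and its shell is not a nologin shell."""
    shell = root_shell(passwd_text)
    return shell is not None and shell not in NOLOGIN_SHELLS


def authorized_key_count(authorized_keys_text):
    """Count non-blank, non-comment lines (one public key per line)."""
    return sum(
        1
        for raw in authorized_keys_text.splitlines()
        if raw.strip() and not raw.strip().startswith("#")
    )


print(root_has_login_shell("root:x:0:0:root:/root:/bin/sh\n"))
print(root_has_login_shell("root:x:0:0:root:/root:/sbin/nologin\n"))
print(authorized_key_count("# build server\nssh-rsa AAAA... dev@example\n"))
```

Any firmware where the key count is above zero deserves a question to the vendor about who holds the matching private key.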
[37:28.440 --> 37:35.340]  Whether it was purposefully put there or not, what's the over under? 175, 175 firmware. Remember
[37:35.340 --> 37:41.500]  this is firmware, however many devices this represents. This is 12 different vendors with
[37:41.500 --> 37:47.700]  these keys. All right. And over 29, these were really interesting known hosts to see where these
[37:47.700 --> 37:52.500]  kind of came from and the different medical facilities and different places that had known
[37:52.500 --> 38:01.340]  hosts coming from them. ClamAV, only four firmware had any kind of open antivirus software.
[38:01.340 --> 38:11.200]  Firmware running httpd with mod_autoindex. So this autoindex is a way to auto index files
[38:11.200 --> 38:17.940]  on your web browser or your web server when there isn't any kind of match for the HTML name. So it's
[38:18.020 --> 38:24.320]  a great way to kind of troll through your directories. So these are, again, things that are
[38:24.320 --> 38:31.280]  questionable that are in our configs. Firmware starting a TTY in inittab. So quite a few
[38:31.280 --> 38:37.500]  firmware are starting TTY, but not that many distinct files. So here we're seeing amplification
[38:37.500 --> 38:45.920]  in the supply chain of here's 63 files, but these 63 files are found in 4,729 devices.
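The amplification idea, a small number of distinct files showing up in a much larger number of firmware images, can be measured by hashing each file and grouping firmware by digest. A minimal sketch, assuming the relevant file has already been pulled out of each image as bytes; the function name and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def amplification(file_by_firmware):
    """Return (distinct file count, firmware count) for one config file.

    file_by_firmware maps a firmware name to the raw bytes of the file
    as found in that firmware image.
    """
    firmware_by_digest = defaultdict(set)
    for firmware, content in file_by_firmware.items():
        digest = hashlib.sha256(content).hexdigest()
        firmware_by_digest[digest].add(firmware)
    distinct_files = len(firmware_by_digest)
    firmware_total = sum(len(names) for names in firmware_by_digest.values())
    return distinct_files, firmware_total


# Hypothetical input: three firmware images, two of them sharing one file.
sample = {
    "fw-a": b"::respawn:/sbin/getty 115200 ttyS0\n",
    "fw-b": b"::respawn:/sbin/getty 115200 ttyS0\n",
    "fw-c": b"::respawn:/sbin/getty 9600 tty1\n",
}
print(amplification(sample))
```

The ratio between the two numbers is the amplification: the larger it is, the more a single upstream file decision has spread through the product line.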
[38:46.140 --> 38:52.700]  Here's a particularly bad one. Firmware with PHP defaulting to display_errors. 332 firmware. So
[38:53.900 --> 39:00.300]  using SQL injection or any kind of command injection, I love when developers leave on
[39:00.300 --> 39:04.720]  their display errors, because when I make a mistake, it tells me exactly what's there.
[39:04.720 --> 39:10.320]  Problem is, PHP by default allows display errors. You have to explicitly turn it off.
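Because the setting is on unless it is explicitly turned off, a check has to treat a missing directive as enabled. A rough sketch of that logic for a php.ini body follows; the function name is hypothetical, the parser is deliberately simple (it strips everything after a `;` as a comment), and it assumes the last assignment in the file wins.

```python
def display_errors_enabled(php_ini_text):
    """Return True if display_errors is effectively on for this php.ini."""
    value = "1"  # assumed built-in default when the directive is absent
    for raw in php_ini_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # ';' starts a comment
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "display_errors":
            value = val.strip().strip('"').strip("'")
    return value.lower() not in {"0", "off", "false", "no", "none", ""}


print(display_errors_enabled(""))
print(display_errors_enabled("display_errors = Off\n"))
print(display_errors_enabled("; display_errors = Off\n"))
```

The third case is the trap: a commented-out `Off` line looks safe at a glance but leaves the default in place.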
[39:10.320 --> 39:15.640]  So maybe we should be making software that doesn't have defaults that are vulnerable like this in
[39:15.640 --> 39:22.240]  their default configuration. All right. Next part. Firmware with insecure default DES encryption. So
[39:22.240 --> 39:25.760]  if you don't specify the type of encryption for your passwords, you're going to use DES,
[39:25.760 --> 39:32.120]  which is a bit weaker. 318 firmware, 29 files. But even worse, there were firmware that specified
[39:32.120 --> 39:38.080]  the flag to turn on MD5 as their password crypt. It was 12 firmware. Fortunately,
[39:38.080 --> 39:41.860]  it's small, but there's still 12 firmware. I have no idea where those are at,
[39:41.860 --> 39:46.600]  but they're out there. Firmware with a master.passwd with no password on root. So in
[39:46.600 --> 39:54.060]  the password file, there is no password. Over, under? 100 devices, 10 devices, a million devices? No,
[39:54.060 --> 40:00.460]  only seven. But still, one file made its way in with no password on the root. MySQL binding to
[40:00.460 --> 40:08.280]  0.0.0.0. So listen to the world and allow people to go right to your MySQL. 26 devices. Firmware
[40:08.280 --> 40:13.060]  with Redis, the same way. Default world bind. So if you don't specify a bind, by default it will
[40:13.060 --> 40:18.320]  allow you in. Three. It's only one file. But still, three firmware allow you into their Redis by
[40:18.600 --> 40:25.020]  default. Average number of unsafe function calls per firmware. So for this one, unsafe function
[40:25.020 --> 40:32.560]  calls is stuff like strcpy, memcpy, all those unbounded ones like Ripple20 found, and stuff
[40:32.560 --> 40:38.580]  like malicious ways to use standard allocation functions without checking lengths and stuff like
[40:38.580 --> 40:46.700]  that. How many do you think per firmware? 1,500 and some average unsafe function calls
[40:46.700 --> 40:54.540]  per firmware. Okay? Number of firmware with unsafe function calls? 23,000. That's over 50%. And those
[40:54.540 --> 40:58.660]  are the ones that we've named. We haven't even trolled through those with function hashing and
[40:58.660 --> 41:06.500]  stuff to find the unlabeled ones that also call these. All right? Firmware. Here's the last one.
[41:06.500 --> 41:12.800]  Firmware exporting NFS mounts to the world. All right? Over, under on this one? 42. There are
[41:12.800 --> 41:20.540]  four files that contribute to 42 firmware mounts to the world, such as slash root, mount a star,
[41:21.040 --> 41:27.620]  so you can as root write to root as user ID zero. So these are all questionable configs that have
[41:27.620 --> 41:32.100]  made their way in, just fun things that I found trolling through. And finally, what would a talk
[41:32.100 --> 41:40.120]  like this be without talking about the common passwords and all that? So what you see amplified
[41:40.120 --> 41:46.160]  here, again, is the supply chain, where you have the file counts of all of this, sort of the light
[41:46.160 --> 41:53.460]  orange, and the dark orange is how many firmware use those files in them. And the number one being
[41:53.460 --> 41:58.660]  admin. It's only in five files in the password file. But those five password files are found in
[42:01.100 --> 42:06.480]  470 firmware, which may represent hundreds of thousands of devices where we've cracked the
[42:06.480 --> 42:11.840]  password admin. I'm going to give you ten to one guess what the password is going to be for that
[42:11.840 --> 42:18.340]  admin user. All right. So anyways, in conclusion, it's all about the software bill of materials.
[42:18.340 --> 42:23.260]  Drive towards generating and validating these. This is the path forward. Manufacturers
[42:23.260 --> 42:29.200]  implementing software inventory systems, developers being diligent and reporting all
[42:29.200 --> 42:34.660]  the components used in their systems. Consumers having a high demand or having a high standard
[42:34.660 --> 42:41.080]  in transparency and favoring and buying from companies with that info as well. Policymakers
[42:41.080 --> 42:48.080]  who help guide these unified standards for reporting and formats. And our security community
[42:48.080 --> 42:53.900]  here developing tools to assist in the automation of the inventory. If we're doing the same thing now
[42:53.900 --> 42:59.060]  in five years, we're falling behind. We should be using machine learning. We should be using tools
[42:59.060 --> 43:04.500]  and other things to help us inventory and validate these things, not just on the front end, but on the
[43:04.500 --> 43:09.060]  back end to validate that patches and updates are what they say they are and they contain what they think
[43:09.060 --> 43:15.800]  they contain. So thanks to all my team at Finite State and a lot of good research that's been in
[43:15.800 --> 43:24.080]  the industry. Finally, obviously the question and answer on this session are going to be on the Discord,
[43:24.080 --> 43:31.140]  but I've had a fun time talking to you all. Peace out. Hope you enjoy the rest of DEF CON
[43:31.140 --> 43:33.740]  and IoT Village. Thank you so much for having me.
